
Date: [2020-11-28 Sat]

Curriculum Learning

Curriculum learning describes a type of learning in which you start out with only easy examples of a task and then gradually increase the task difficulty.

I will use a rather small convolutional neural network which tries to classify images into 10 categories.

I start out with six of the ten classes and then introduce one new class with each new epoch. This means that after five epochs all ten classes are in the data set. After that the network keeps training for another 20 epochs to reach a plateauing performance.
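The schedule described above can be sketched in a few lines of Python. This is only a minimal sketch, not the code used for the experiment: the function names `class_schedule` and `filter_by_classes`, the 25-epoch total (5 epochs in which the class set grows plus 20 more) and the representation of the data set as (image, label) pairs are assumptions made for illustration.

```python
import random


def class_schedule(class_order, initial_classes=6, total_epochs=25):
    """Return, for each epoch, the list of classes that take part in training.

    Epoch 0 uses the first `initial_classes` entries of `class_order`;
    every following epoch adds the next class until all are included.
    """
    schedule = []
    for epoch in range(total_epochs):
        n_active = min(initial_classes + epoch, len(class_order))
        schedule.append(class_order[:n_active])
    return schedule


def filter_by_classes(samples, active_classes):
    """Keep only the (image, label) pairs whose label is currently active."""
    active = set(active_classes)
    return [(image, label) for image, label in samples if label in active]


# One run with a random class order (hypothetical labels 0..9).
order = list(range(10))
random.Random(0).shuffle(order)
schedule = class_schedule(order)
```

In a training loop one would call `filter_by_classes(train_samples, schedule[epoch])` at the start of each epoch and train on the result.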

When repeating this experiment many times, each time with a random order of the classes, one can observe some runs which perform especially well. Looking at these highest-performing runs and plotting the class orders they used against the respective class difficulties, one can observe a significant negative correlation of -0.27 between the two, F(1,16.43), p<0.001. As class difficulty, I define the network’s performance on the class at the end of a normal training.
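To make the two quantities concrete, here is a hedged sketch of how the difficulty measure and the correlation could be computed. It assumes that "performance on the class" means per-class accuracy and that the correlation is a plain Pearson coefficient between a class's position in the order and its difficulty value; the names `per_class_accuracy`, `pearson` and `order_vs_difficulty` are made up for this example, and the significance test is not reproduced.

```python
from math import sqrt


def per_class_accuracy(y_true, y_pred, num_classes=10):
    """Fraction of correctly classified samples for each class label."""
    correct = [0] * num_classes
    total = [0] * num_classes
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    # Classes without any sample get NaN instead of a division by zero.
    return [c / n if n else float("nan") for c, n in zip(correct, total)]


def pearson(xs, ys):
    """Pearson correlation coefficient (assumes both inputs vary)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


def order_vs_difficulty(class_order, class_accuracy):
    """Correlate each class's position in the order with its difficulty value."""
    positions = list(range(len(class_order)))
    difficulties = [class_accuracy[c] for c in class_order]
    return pearson(positions, difficulties)
```

For the analysis above, the (position, difficulty) pairs of all top-performing runs would be pooled before computing the correlation.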

Taking these results one step further, one can compare the network performance when training with increasing difficulty to training with decreasing difficulty.
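The two class orders for this comparison could be derived as in the following sketch. It assumes that a class counts as easier the higher the accuracy reached on it in a normal training, so "increasing difficulty" means highest accuracy first; the function name `order_by_difficulty` and the example accuracy values are purely illustrative.

```python
def order_by_difficulty(class_accuracy, increasing=True):
    """Return class labels sorted by difficulty.

    `class_accuracy[c]` is the accuracy reached on class c in a normal
    training; a higher accuracy is treated as an easier class.
    increasing=True  -> easiest class first (increasing difficulty)
    increasing=False -> hardest class first (decreasing difficulty)
    """
    labels = range(len(class_accuracy))
    return sorted(labels, key=lambda c: class_accuracy[c], reverse=increasing)


# Illustrative accuracies for three classes (not measured values).
example_accuracy = [0.9, 0.5, 0.7]
easy_first = order_by_difficulty(example_accuracy, increasing=True)
hard_first = order_by_difficulty(example_accuracy, increasing=False)
```

Either order can then be used as the class order for the epoch-wise introduction of new classes.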

The results show a highly significant difference between the two conditions: the networks trained with increasing difficulty have a lead of around 4% in accuracy.

The normal training has a slight advantage over the continuous learning, since the latter has fewer epochs to train on the classes that are only introduced later on. Even so, the network trained with gradually increasing difficulty reaches significantly higher performance, F(1,953.43), p<0.001.

It seems as if learning the broad concept on a few easy examples and only later refining it with more complex examples gives the continuously learning network a distinct advantage over a network which needs to grasp the whole concept at once. When thinking about real life this appears quite intuitive: one would not mix advanced calculus into a first grader’s math homework. With neural networks, however, this seems to be common practice. Of course it is an additional effort to determine individual class difficulties, but for reaching or exceeding benchmarks and deploying an optimal model this additional effort can easily be worth it.

